Align KTO with DPO: Remove enforcement of causal language models by albertvillanova · Pull Request #5701 · huggingface/trl

albertvillanova · 2026-05-05T11:54:56Z

Align KTO with DPO: Remove enforcement of causal language models.

This PR makes a small change to the KTOTrainer by removing the check that raised an error when an encoder-decoder model was used. As a result, the restriction that KTO only supports causal language models is no longer enforced in the code.
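To make the change concrete, here is a minimal sketch of the kind of guard being removed. It is not the actual TRL code: the function names `check_causal_lm_only`, `init_before`, and `init_after`, and the error message, are hypothetical. The only real-world detail it relies on is that Transformers model configs expose an `is_encoder_decoder` attribute, mimicked here with a `SimpleNamespace` so the example runs without any dependencies.

```python
from types import SimpleNamespace


def check_causal_lm_only(model_config):
    """Hypothetical guard of the kind this PR removes (not the TRL source)."""
    if getattr(model_config, "is_encoder_decoder", False):
        raise ValueError("KTO only supports causal language models.")


def init_before(model_config):
    """Sketch of initialization before the PR: encoder-decoder is rejected."""
    check_causal_lm_only(model_config)
    return "initialized"


def init_after(model_config):
    """Sketch of initialization after the PR: no architecture check at all."""
    return "initialized"


# Stand-ins for model configs; real configs carry the same attribute.
causal_config = SimpleNamespace(is_encoder_decoder=False)
seq2seq_config = SimpleNamespace(is_encoder_decoder=True)

print(init_before(causal_config))   # accepted both before and after
print(init_after(seq2seq_config))   # no longer rejected at init time
```

Under this sketch, passing an encoder-decoder config no longer fails at construction. Any incompatibility would surface later, during training, which is the risk the automated review note on this PR describes.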

Part of:

KTO refactoring #4786

@qgallouedec, shouldn't I enforce this in DPO instead?

Note

Medium Risk
Removes a guardrail in KTOTrainer that previously blocked encoder-decoder models, which may expose unsupported/untested model architectures to KTO training and lead to runtime errors or incorrect loss computation.

Overview
KTOTrainer no longer raises an error when the provided model is configured as encoder-decoder, aligning its initialization behavior with DPO by removing the causal-LM-only enforcement.

Reviewed by Cursor Bugbot for commit 577037c. Bugbot is set up for automated code reviews on this repo.

HuggingFaceDocBuilderDev · 2026-05-05T11:57:23Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Commit 577037c: Remove error for encoder-decoder models

qgallouedec approved these changes May 5, 2026

albertvillanova merged commit babb16b into main May 5, 2026
6 checks passed

albertvillanova deleted the align-kto-dpo-rm-encoder-decoder branch May 5, 2026 14:27
